Noise Clustering via Dynamic Data Assigning Assessment

Authors

  • Olga Georgieva
  • Frank Klawonn
  • Katharina Tschumitschew
Abstract

A new clustering algorithm that identifies clusters step by step is introduced. It is based on the principles of noise clustering: the data set is divided into one good cluster and the remaining data, which may contain only noise or further clusters. The algorithm can be used to find just a few substructures (clusters), but also as an iterative method for partitioning the data, including identifying the number of clusters and the noise data. It is applicable to both hard and fuzzy clustering techniques. An extended variant of the algorithm is developed to determine clusters that are sub-clusters of an already separated cluster. The algorithm is applied to a gene expression data set, where it finds groups of coregulated genes.
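
To make the step-by-step idea concrete, the following is a minimal sketch of hard noise clustering with a single cluster, applied repeatedly to the remaining data. It is an illustration under stated assumptions, not the authors' dynamic data assigning assessment procedure: the function name `one_cluster_noise`, the fixed noise distance `delta`, the hand-picked starting centres, and the toy points are all hypothetical.

```python
import math

def one_cluster_noise(points, start, delta, max_iter=100):
    """Split points into one cluster and the remaining data.

    Assumption: a point joins the cluster when it lies closer than the
    noise distance `delta` to the centre; everything else is "the rest"
    (noise or further clusters). The centre is the mean of the members.
    """
    center = tuple(float(x) for x in start)
    for _ in range(max_iter):
        members = [p for p in points if math.dist(p, center) < delta]
        if not members:
            break  # nothing within delta of the centre
        new_center = tuple(sum(c) / len(members) for c in zip(*members))
        if new_center == center:
            break  # centre no longer moves
        center = new_center
    # Final split with respect to the last centre.
    members = [p for p in points if math.dist(p, center) < delta]
    rest = [p for p in points if math.dist(p, center) >= delta]
    return center, members, rest

# Hypothetical toy data: two small groups and one isolated point.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50)]

# Step 1: separate one cluster, keep the rest for the next step.
c1, cluster1, rest1 = one_cluster_noise(data, start=(0, 0), delta=3.0)
# Step 2: repeat on the remaining data only.
c2, cluster2, rest2 = one_cluster_noise(rest1, start=(10, 10), delta=3.0)
```

In this sketch `delta` and the starting centres are fixed by hand. How a good cluster is actually selected and assessed, and how the fuzzy variant assigns memberships, is the subject of the paper and is not reproduced here.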


Similar resources

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the tremendous and increasing growth of spatial trajectory data and the necessity of processing and extraction of useful information and meaningful patterns have led to the fact that many researchers have been attracted to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...


Dynamic data assigning assessment clustering of streaming data

Discovering interesting patterns or substructures in data streams is an important challenge in data mining. Clustering algorithms are very often applied to identify single substructures although they are designed to partition a data set. Another problem of clustering algorithms is that most of them are not designed for data streams. This paper discusses a recently introduced procedure that deal...


Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...


Predicting Missing Attribute Values Using k-Means Clustering

Problem statement: Predicting the value for missing attributes is an important data preprocessing problem in data mining and knowledge discovery tasks. Several methods have been proposed to treat missing data and the one used more frequently is deleting instances containing at least one missing value of a feature. When the dataset has minimum number of missing attribute values then we can negle...


Impact of Dynamic Assessment on Iranian EFL Learners' Picture-cued Writing

In Iran, most English teachers’ method of teaching writing is merely to have students do some writing exercises or simply to give them writing tests without any instruction, but writing is not an easy task for students, and teachers should be able to do more to facilitate their students’ writing. One of the ways to aid writing is dynamic assessment via graduated prompt. The graduated p...




Publication date: 2005